智能论文笔记

Active Learning for Non-Parametric Choice Models

Fransisca Susan , Negin Golrezaei , Ehsan Emamjomeh-Zadeh , David Kempe

分类：机器学习 | (统计)机器学习

2022-08-05

我们研究了基于消费者的决策积极学习非参数选择模型的问题。我们提出一个负面结果，表明这种选择模型可能无法识别。为了克服可识别性问题，我们介绍了选择模型的有向无环图（DAG）表示，从某种意义上说，该模型可以捕获有关选择模型的更多信息，从而可以从理论上识别信息。然后，我们考虑在主动学习环境中学习与此DAG表示的近似的问题。我们设计了一种有效的主动学习算法，以估计非参数选择模型的DAG表示，该模型在多项式时间内运行时，当随机均匀地绘制频繁排名。我们的算法通过主动和反复提供各种项目并观察所选项目来了解最受欢迎的频繁偏好项目的分布。我们表明，与相应的非活动学习估计算法相比，我们的算法可以更好地恢复有关消费者偏好的合成和公开数据集的一组频繁偏好。这证明了我们的算法和主动学习方法的价值。

translated by 谷歌翻译

Plurality Veto: A Simple Voting Rule Achieving Optimal Metric Distortion

Fatih Erdem Kizilkaya , David Kempe

分类：人工智能

2022-06-14

该度量扭曲框架认为，n个选民和M候选人共同嵌入到公制空间中，因此选民对候选人的距离更高。投票规则的目的是仅鉴于排名，而不是实际距离，挑选了与选民总距离最小距离的候选人。结果，在最坏的情况下，每个确定性规则都选择了一个候选人，其总距离至少比最佳候选者大三倍，即至少有失真。最近的突破结果表明，达到3个边界可能但是，证明是非构造性的，投票规则本身是一个复杂的详尽搜索。我们的主要结果是一个非常简单的投票规则，称为多数否决权，它实现了相同的最佳失真为3。每个候选人的分数均以等于他的第一名选票数量的分数开始。然后，这些分数通过N回能的否决过程逐渐降低，在该过程中，当候选人得分达到零时，候选人会退出。一个接一个地，选民在常设候选人中降低了他们最不可能的选择，最后一位常设候选人赢得了胜利。我们给出一个单段证明，该投票规则实现了失真3.该规则也非常实用，它仅对每个选民进行两个疑问，因此它的沟通开销较低。我们还通过以下方式将多元化否决权否决为一类随机投票规则：复数否决仅针对k <n回合；然后，选择候选人的概率与他的剩余分数成正比。该一般规则在随机独裁统治（对于K = 0）和多数否决（对于K = N-1）之间，而K控制输出的方差。我们表明，对于所有k，该规则最多都会失真3。

translated by 谷歌翻译

Fairness in Ranking under Uncertainty

Ashudeep Singh , David Kempe , Thorsten Joachims

分类：机器学习

2021-07-14

公平性是在算法决策中的重要考虑因素。当具有较高优异的代理人获得比具有较低优点的试剂更差的代理人时，发生不公平。我们的中心点是，不公平的主要原因是不确定性。制定决策的主体或算法永远无法访问代理的真实优点，而是使用仅限于不完全预测优点的代理功能（例如，GPA，星形评级，推荐信）。这些都没有完全捕捉代理人的优点;然而，现有的方法主要基于观察到的特征和结果直接定义公平概念。我们的主要观点是明确地承认和模拟不确定性更为原则。观察到的特征的作用是产生代理商的优点的后部分布。我们使用这个观点来定义排名中近似公平的概念。我们称之为algorithm $ \ phi $ -fair（对于$ \ phi \ in [0,1] $）如果它具有以下所有代理商$ x $和所有$ k $：如果代理商$ x $最高$ k $代理以概率至少为$ \ rho $（根据后部优点分配），那么该算法将代理商在其排名中以概率排名，至少$ \ phi \ rho $。我们展示了如何计算最佳地互惠对校长进行近似公平性的排名。除了理论表征外，我们还提出了对模拟研究中的方法的潜在影响的实证分析。对于真实世界的验证，我们在纸质建议系统的背景下应用了这种方法，我们在KDD 2020会议上建立和界定。

translated by 谷歌翻译

Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

Thanh Le-Cong , Duc-Minh Luong , Xuan Bach D. Le , David Lo , Nhat-Hoa Tran , Bui Quang-Huy , Quyet-Thang Huynh

分类：机器学习

2023-01-03

In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantic via program invariants while it also captures program syntax via language semantic learned from large code corpus using the pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that a APR-generated patch overfits if: (1) it violates correct specifications or (2) maintains errors behaviors of the original buggy program. In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR is able to leverage both semantic and syntactic reasoning to enhance its discriminant capability. Second, INVALIDATOR does not require new test cases to be generated but instead only relies on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We have conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experiment results show that INVALIDATOR correctly classified 79% overfitting patches, accounting for 23% more overfitting patches being detected by the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.

translated by 谷歌翻译

Conservation Tools: The Next Generation of Engineering--Biology Collaborations

Andrew Schulz , Cassie Shriver , Suzanne Stathatos , Benjamin Seleb , Emily Weigel , Young-Hui Chang , M. Saad Bhamla , David Hu , Joseph R. Mendelson III , .

分类：机器学习

2023-01-03

The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.

translated by 谷歌翻译

Posterior Collapse and Latent Variable Non-identifiability

Yixin Wang , David M. Blei , John P. Cunningham

分类： (统计)机器学习 | 机器学习

2023-01-02

Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.

translated by 谷歌翻译

Mapping smallholder cashew plantations to inform sustainable tree crop expansion in Benin

Leikun Yin , Rahul Ghosh , Chenxi Lin , David Hale , Christoph Weigl , James Obarowski , Junxiong Zhou , Jessica Till , Xiaowei Jia , Troy Mao

分类：计算机视觉 | 机器学习

2023-01-01

Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.

translated by 谷歌翻译

Morphology-based non-rigid registration of coronary computed tomography and intravascular images through virtual catheter path optimization

Karim Kadry , Abhishek Karmakar , Andreas Schuh , Kersten Peterson , Michiel Schaap , David Marlevi , Charles Taylor , Elazer Edelman , Farhad Nezami

分类：计算机视觉

2022-12-30

Coronary Computed Tomography Angiography (CCTA) provides information on the presence, extent, and severity of obstructive coronary artery disease. Large-scale clinical studies analyzing CCTA-derived metrics typically require ground-truth validation in the form of high-fidelity 3D intravascular imaging. However, manual rigid alignment of intravascular images to corresponding CCTA images is both time consuming and user-dependent. Moreover, intravascular modalities suffer from several non-rigid motion-induced distortions arising from distortions in the imaging catheter path. To address these issues, we here present a semi-automatic segmentation-based framework for both rigid and non-rigid matching of intravascular images to CCTA images. We formulate the problem in terms of finding the optimal \emph{virtual catheter path} that samples the CCTA data to recapitulate the coronary artery morphology found in the intravascular image. We validate our co-registration framework on a cohort of $n=40$ patients using bifurcation landmarks as ground truth for longitudinal and rotational registration. Our results indicate that our non-rigid registration significantly outperforms other co-registration approaches for luminal bifurcation alignment in both longitudinal (mean mismatch: 3.3 frames) and rotational directions (mean mismatch: 28.6 degrees). By providing a differentiable framework for automatic multi-modal intravascular data fusion, our developed co-registration modules significantly reduces the manual effort required to conduct large-scale multi-modal clinical studies while also providing a solid foundation for the development of machine learning-based co-registration approaches.

translated by 谷歌翻译

Controllable Mechanical-domain Energy Accumulators

Sung Y. Kim , David J. Braun

分类：机器人

2022-12-29

Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high-fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock up to 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1 % of the maximal spring force, and provide an 80 % energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots.

translated by 谷歌翻译

Novel Spring Mechanism Enables Iterative Energy Accumulation under Force and Deformation Constraints

Cole A. Dempsey , David J. Braun

分类：机器人

2022-12-29

Springs can provide force at zero net energy cost by recycling negative mechanical work to benefit motor-driven robots or spring-augmented humans. However, humans have limited force and range of motion, and motors have a limited ability to produce force. These limits constrain how much energy a conventional spring can store and, consequently, how much assistance a spring can provide. In this paper, we introduce an approach to accumulating negative work in assistive springs over several motion cycles. We show that, by utilizing a novel floating spring mechanism, the weight of a human or robot can be used to iteratively increase spring compression, irrespective of the potential energy stored by the spring. Decoupling the force required to compress a spring from the energy stored by a spring advances prior works, and could enable spring-driven robots and humans to perform physically demanding tasks without the use of large actuators.

translated by 谷歌翻译